9 research outputs found

    Improving Term Extraction with Terminological Resources

    Studies of different term extractors on a biomedical corpus revealed decreased performance when they are applied to highly technical texts. The difficulty or impossibility of customising them to new domains is an additional limitation. In this paper, we propose to use external terminologies to influence generic linguistic data in order to improve the quality of the extraction. The tool we implemented exploits testified terms at different steps of the process: chunking, parsing and extraction of term candidates. Experiments reported here show that, using this method, more term candidates can be acquired with a higher level of reliability. We further describe the extraction process involving endogenous disambiguation implemented in the term extractor YaTeA.

    Algorithm for Adapting Cases Represented in a Tractable Description Logic

    Case-based reasoning (CBR) based on description logics (DLs) has gained a lot of attention lately. Adaptation is a basic task in CBR inference that can be modeled as the knowledge base revision problem and solved in propositional logic. However, in DLs, it is still a challenging problem, since existing revision operators only work well for strictly restricted DLs of the DL-Lite family, and it is difficult to design a revision algorithm which is syntax-independent and fine-grained. In this paper, we present a new method for adaptation based on the DL $\mathcal{EL}_{\bot}$. Following the idea of adaptation as revision, we first extend the logical basis for describing cases from propositional logic to the DL $\mathcal{EL}_{\bot}$, and present a formalism for adaptation based on $\mathcal{EL}_{\bot}$. Then we present an adaptation algorithm for this formalism and demonstrate that our algorithm is syntax-independent and fine-grained. Our work provides a logical basis for adaptation in CBR systems where cases and domain knowledge are described by the tractable DL $\mathcal{EL}_{\bot}$. Comment: 21 pages. ICCBR 201

    Beyond representing orthology relations by trees

    Reconstructing the evolutionary past of a family of genes is an important aspect of many genomic studies. To help with this, simple relations on a set of sequences called orthology relations may be employed. In addition to being interesting from a practical point of view, they are also attractive from a theoretical perspective in that, e.g., a characterization is known for when such a relation is representable by a certain type of phylogenetic tree. For an orthology relation inferred from real biological data, however, it is generally too much to hope that it satisfies that characterization. Rather than trying to correct the data in some way, which has its own drawbacks, we propose to represent an orthology relation $\delta$ in terms of a structure more general than a phylogenetic tree, called a phylogenetic network. To compute such a network in the form of a level-1 representation for $\delta$, we formalize an orthology relation in terms of the novel concept of a symbolic 3-dissimilarity, which is motivated by the biological concept of a "cluster of orthologous groups", or COG for short. For such maps, which assign symbols rather than real values to elements, we introduce the novel Network-Popping algorithm, which has several attractive properties. In addition, we characterize an orthology relation $\delta$ on some set $X$ that has a level-1 representation in terms of eight natural properties for $\delta$ as well as in terms of level-1 representations of orthology relations on certain subsets of $X$.

    Clustering Multi-Represented Objects with Noise

    Traditional clustering algorithms are based on one representation space, usually a vector space. However, in a variety of modern applications, multiple representations exist for each object. Molecules, for example, are characterized by an amino acid sequence, a secondary structure and a 3D representation. In this paper, we present an efficient density-based approach to cluster such multi-represented data, taking all available representations into account. We propose two different techniques to combine the information of all available representations dependent on the application. The evaluation part shows that our approach is superior to existing techniques.
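
The abstract above does not say how the two combination techniques work, so the following is only a minimal sketch of one plausible reading, not the authors' method. It assumes that each representation is a list of coordinate tuples compared by Euclidean distance, and that the density test for an object is run on either the union or the intersection of its per-representation neighborhoods. The names `neighbors`, `is_core`, `eps`, `min_pts` and `mode` are hypothetical.

```python
from math import dist  # Euclidean distance, Python 3.8+


def neighbors(points, i, eps):
    """Indices (including i) of points within eps of points[i]."""
    return {j for j, p in enumerate(points) if dist(points[i], p) <= eps}


def is_core(reps, eps, i, min_pts, mode):
    """Density test for object i across several representations.

    reps    -- list of representations; reps[r][k] is object k in space r
    eps     -- one neighborhood radius per representation
    mode    -- "union" or "intersection" (assumed combination rules)
    """
    hoods = [neighbors(points, i, e) for points, e in zip(reps, eps)]
    if mode == "union":
        combined = set.union(*hoods)
    elif mode == "intersection":
        combined = set.intersection(*hoods)
    else:
        raise ValueError("mode must be 'union' or 'intersection'")
    return len(combined) >= min_pts


# Four objects, each described in two toy 2D representations.
rep_a = [(0, 0), (0.1, 0), (5, 5), (0.2, 0)]
rep_b = [(0, 0), (9, 9), (0.1, 0), (0.1, 0.1)]
union_core = is_core([rep_a, rep_b], [0.5, 0.5], 0, 3, "union")
inter_core = is_core([rep_a, rep_b], [0.5, 0.5], 0, 3, "intersection")
```

Under these assumptions, the union rule suits sparse or noisy representations, because a neighbor in any one space counts, while the intersection rule is stricter and only counts objects that are close in every space.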